Cross-Entropy Clustering
We construct a cross-entropy clustering (CEC) theory which finds the optimal
number of clusters by automatically removing groups which carry no information.
Moreover, our theory gives a simple and efficient criterion to verify cluster
validity.
Although CEC can be built on an arbitrary family of densities, in the most
important case of Gaussian CEC:
-- the division into clusters is affine invariant;
-- the clustering has a tendency to divide the data into
ellipsoid-type shapes;
-- the approach is computationally efficient, as we can apply the Hartigan
approach.
We also study, with particular attention, clustering based on spherical
Gaussian densities and on Gaussian densities with covariance $s \cdot \mathrm{I}$. In
the latter case we show that, with $s$ converging to zero, we obtain the
classical k-means clustering.
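To make the Gaussian case concrete, the following is a minimal sketch of the CEC energy as we read it: each cluster pays $-\ln p_j$ for its probability mass plus the cross-entropy of its points with the best-fitting Gaussian. The function name, the small ridge term, and the degeneracy guard are our assumptions, not part of the paper.

```python
import numpy as np

def gaussian_cec_energy(X, labels, k):
    """Sketch of the Gaussian CEC energy:
    E = sum_j p_j * ( -ln p_j + d/2 * ln(2*pi*e) + 1/2 * ln det(Sigma_j) )."""
    n, d = X.shape
    energy = 0.0
    for j in range(k):
        Xj = X[labels == j]
        if len(Xj) <= d:
            # Skip empty/degenerate groups; in CEC such groups end up
            # removed, which is how the number of clusters shrinks.
            continue
        p = len(Xj) / n
        # Maximum-likelihood covariance, ridged for numerical stability.
        cov = np.cov(Xj, rowvar=False, bias=True) + 1e-9 * np.eye(d)
        energy += p * (-np.log(p)
                       + 0.5 * d * np.log(2 * np.pi * np.e)
                       + 0.5 * np.log(np.linalg.det(cov)))
    return energy
```

A Hartigan-style optimization then greedily reassigns individual points whenever doing so lowers this energy, deleting clusters that lose all of their points.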
Extreme Entropy Machines: Robust information theoretic classification
Most existing classification methods aim at minimizing empirical risk
(through some simple point-based error measured with a loss function) with
added regularization. We propose to approach this problem in a more
information-theoretic way by investigating the applicability of entropy
measures as a classification model objective function. We focus on quadratic
Rényi entropy and the connected Cauchy-Schwarz Divergence, which leads to the
construction of Extreme Entropy Machines (EEM).
The main contribution of this paper is a model based on information-theoretic
concepts which, on the one hand, offers a new, entropic perspective on known
linear classifiers and, on the other, leads to the construction of a very
robust method competitive with state-of-the-art non-information-theoretic
ones (including Support Vector Machines and Extreme Learning Machines).
Evaluation on numerous problems, spanning from small, simple ones from the UCI
repository to large (hundreds of thousands of samples), extremely unbalanced
(up to 100:1 class ratios) datasets, shows the wide applicability of the EEM
to real-life problems and that it scales well.
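For background on the objective named above, here is a minimal Parzen-window sketch of quadratic Rényi entropy and the Cauchy-Schwarz Divergence for samples X and Y. The Gaussian kernel, the bandwidth h, and the function names are our assumptions; this is the generic plug-in estimator, not the EEM model itself.

```python
import numpy as np

def kde_cross_term(X, Y, h):
    """Closed form for the integral of the product of two Gaussian KDEs:
    the mean over all pairs of N(x_i - y_j; 0, 2*h^2*I)."""
    d = X.shape[1]
    sq = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return (4 * np.pi * h**2) ** (-d / 2) * np.mean(np.exp(-sq / (4 * h**2)))

def quadratic_renyi_entropy(X, h):
    """H_2(p) = -log( integral of p(x)^2 ), with p a Gaussian KDE of X."""
    return -np.log(kde_cross_term(X, X, h))

def cauchy_schwarz_divergence(X, Y, h):
    """D_CS(p, q) = -log( int(p*q) / sqrt(int(p^2) * int(q^2)) );
    non-negative, and zero iff p = q almost everywhere."""
    pq = kde_cross_term(X, Y, h)
    pp = kde_cross_term(X, X, h)
    qq = kde_cross_term(Y, Y, h)
    return -np.log(pq / np.sqrt(pp * qq))
```

Maximizing cauchy_schwarz_divergence between class-conditional samples is one natural way such entropic objectives enter a linear classifier.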
Paraconvex, but not strongly, Takagi functions
There is an important open problem in the theory of approximate convexity: whether every paraconvex function on a bounded interval is strongly paraconvex. Our aim is to show that this is not the case. To do this we need the following generalization of the Takagi function. For a sequence $a = (a_i)_{i \in \mathbb{N}} \subset \mathbb{R}_+$ we consider a Takagi-like function of the form $T_{(a)}(x) := \sum_{i=1}^{\infty} a_i \operatorname{dist}\big(x, \tfrac{1}{2^{i-1}}\mathbb{Z}\big)$ for $x \in \mathbb{R}$. We give convenient conditions for verifying whether $T_{(a)}$ is paraconvex or strongly paraconvex. This enables us to construct a class of paraconvex functions which are not strongly paraconvex.
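As a sanity check on the definition (our remark, not part of the abstract): since $\operatorname{dist}(x, \tfrac{1}{2^{i-1}}\mathbb{Z}) = \tfrac{1}{2^{i-1}}\operatorname{dist}(2^{i-1}x, \mathbb{Z})$, the constant sequence $a_i = 1$ recovers the classical Takagi function:

```latex
T_{(1,1,\dots)}(x)
  = \sum_{i=1}^{\infty} \operatorname{dist}\!\Big(x, \tfrac{1}{2^{i-1}}\mathbb{Z}\Big)
  = \sum_{n=0}^{\infty} \frac{1}{2^{n}} \operatorname{dist}\big(2^{n}x, \mathbb{Z}\big).
```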
LOSSGRAD: automatic learning rate in gradient descent
In this paper, we propose a simple, fast and easy-to-implement algorithm,
LOSSGRAD (locally optimal step-size in gradient descent), which automatically
modifies the step-size in gradient descent during neural network training.
Given a function $f$, a point $x$, and the gradient $\nabla f(x)$, we aim
to find the step-size $h$ which is (locally) optimal, i.e. satisfies
$$h = \arg\min_{t \ge 0} f\big(x - t\,\nabla f(x)\big).$$
Making use of a quadratic approximation, we show that the algorithm satisfies
the above condition. We experimentally show that our method is insensitive to
the choice of the initial learning rate while achieving results comparable to
other methods.
Comment: TFML 201
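A minimal sketch of the quadratic-approximation idea, reconstructed under our own assumptions (the names, the probe-based fit, and the fallback rule are ours, not necessarily the authors' exact update): fit a parabola to $\varphi(t) = f(x - t\,\nabla f(x))$ from $\varphi(0)$, $\varphi'(0) = -\|\nabla f(x)\|^2$, and one probe evaluation $\varphi(h)$, then step to the parabola's minimizer.

```python
import numpy as np

def lossgrad_like_step(f, grad_f, x, h):
    """One gradient step with an approximately locally optimal step-size.
    Sketch only; not the paper's exact algorithm."""
    g = grad_f(x)
    phi0 = f(x)
    dphi0 = -np.dot(g, g)      # derivative of phi(t) = f(x - t*g) at t = 0
    phi_h = f(x - h * g)       # single probe at the current step-size
    # Fit q(t) = phi0 + dphi0*t + c*t^2 through the probe point.
    c = (phi_h - phi0 - dphi0 * h) / h**2
    if c > 0:
        h_new = -dphi0 / (2 * c)   # minimizer of the upward-opening parabola
    else:
        h_new = 2 * h              # no usable curvature: grow the step
    return x - h_new * g, h_new

# In a training loop, h_new is carried forward as the next initial step-size.
```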